Performance on the test was generally strong. The mean was 93.3 and the standard deviation 4.7.
Here is the distribution of results.
Friday, September 19, 2014
Here is a display of the number of wrong responses, by question number. As you can see, items 22, 23, and 24 were the most difficult.
Consider two columns of numbers, \(X\) and \(Y\). The sum of cross-products of deviation scores of the numbers is equal to
\[ \sum_{i=1}^{N}\left( X_{i}-\overline{X}_{\bullet }\right) \left( Y_{i}-\overline{Y}_{\bullet }\right) =SCP \]
The correct answer was "D" because \(SCP\) is always equal to all of the following quantities:
\(\sum_{i=1}^{N}\left( X_{i}-\overline{X}_{\bullet }\right) \left( Y_{i}\right) \)
\(\sum_{i=1}^{N}\left( X_{i}\right) \left( Y_{i}-\overline{Y}_{\bullet }\right) \)
\(\sum_{i=1}^{N}X_{i}Y_{i}-\left( \sum_{i=1}^{N}X_{i}\sum_{i=1}^{N}Y_{i}\right) /N\)
It is easy to demonstrate numerically with R that the 4 formulas always seem to yield the same value for any data set.
While this isn't a proof, you can run the routine below dozens of times with different sample sizes and you'll keep getting the same value for all 4 quantities. This gives you a very strong hint about the answer!
Here is a function that computes all 4 quantities on sets of random numbers. Run it as many times as you wish.
set.seed(12345)
n <- 10
compareSCP <- function(n){
  X <- rnorm(n); Y <- rnorm(n)
  Xbar <- mean(X); Ybar <- mean(Y)
  SCP <- sum((X-Xbar)*(Y-Ybar))
  QuantityA <- sum(Y*(X-Xbar))
  QuantityB <- sum(X*(Y-Ybar))
  QuantityC <- sum(X*Y) - sum(X)*sum(Y)/n
  print(c(SCP,QuantityA,QuantityB,QuantityC))
}
compareSCP(10)
## [1] -1.675 -1.675 -1.675 -1.675
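If you would rather not compare printed digits by eye, one option is to let R check the agreement over many random data sets. The following is only a sketch and not part of the original routine: the helper name `checkSCP`, the seed, the tolerance of `1e-8`, and the 200 replications are all assumptions chosen for illustration.

```r
# Hypothetical helper (not from the original post): returns TRUE when the
# three alternative quantities agree with SCP, up to floating-point error,
# for one random data set of size n.
checkSCP <- function(n) {
  X <- rnorm(n); Y <- rnorm(n)
  Xbar <- mean(X); Ybar <- mean(Y)
  SCP <- sum((X - Xbar) * (Y - Ybar))
  QuantityA <- sum(Y * (X - Xbar))
  QuantityB <- sum(X * (Y - Ybar))
  QuantityC <- sum(X * Y) - sum(X) * sum(Y) / n
  # Largest absolute disagreement must be negligible (assumed tolerance).
  max(abs(c(QuantityA, QuantityB, QuantityC) - SCP)) < 1e-8
}

set.seed(2014)                                  # arbitrary seed
sizes <- sample(2:100, 200, replace = TRUE)     # 200 random sample sizes
results <- sapply(sizes, checkSCP)
all(results)                                    # TRUE if every data set agreed
```

As with the printed comparison, a run of agreements is evidence rather than proof.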
You can also prove it analytically. The third formula is the computational formula for SCP given in the class notes for covariance.
Moreover, if either of the first two formulas is correct, the other must be correct, because which column of numbers is designated to be \(X\) and which is designated to be \(Y\) is arbitrary.
We'll show with summation algebra that the second formula is equal to \(SCP\).
\[\sum_{i=1}^{N}(X_i - \overline{X}_\bullet)(Y_i - \overline{Y}_\bullet) = \sum_{i=1}^{N}(X_i(Y_i - \overline{Y}_\bullet) - \overline{X}_\bullet(Y_i - \overline{Y}_\bullet))\]
By the distributive rule, the right side is equal to
\[\sum_{i=1}^{N}X_i(Y_i - \overline{Y}_\bullet) - \sum_{i=1}^{N}\overline{X}_\bullet(Y_i - \overline{Y}_\bullet)\]
By the second constant rule, the far right term simplifies, yielding
\[\sum_{i=1}^{N}X_i(Y_i - \overline{Y}_\bullet) - \overline{X}_\bullet\sum_{i=1}^{N}(Y_i - \overline{Y}_\bullet)\]
Now we see that the far right term involves the sum of the \(Y\) deviations, which is always equal to zero. So the entire right term drops out, leaving us with the left term
\[\sum_{i=1}^{N}X_i(Y_i - \overline{Y}_\bullet) + 0\]
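For completeness, here is a sketch of the two facts the argument leans on, written in the same notation. It uses only the definition \(\overline{Y}_\bullet = \frac{1}{N}\sum_{i=1}^{N}Y_i\) and is offered as a supplement, not as the derivation from the class notes. First, the \(Y\) deviations sum to zero:

\[\sum_{i=1}^{N}(Y_i - \overline{Y}_\bullet) = \sum_{i=1}^{N}Y_i - N\overline{Y}_\bullet = \sum_{i=1}^{N}Y_i - \sum_{i=1}^{N}Y_i = 0\]

Second, expanding the second formula and substituting the definition of \(\overline{Y}_\bullet\) recovers the third (computational) formula:

\[\sum_{i=1}^{N}X_i(Y_i - \overline{Y}_\bullet) = \sum_{i=1}^{N}X_iY_i - \overline{Y}_\bullet\sum_{i=1}^{N}X_i = \sum_{i=1}^{N}X_iY_i - \left(\sum_{i=1}^{N}X_i\sum_{i=1}^{N}Y_i\right)/N\]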
The answer is "A" because the sample variance cannot be expressed as a linear combination of the scores in \(X\). All the other expressions are linear combinations of the scores in \(X\), as we show below:
Sample mean (all weights \(1/N\)) \[ \overline{X}_\bullet = \sum_{i=1}^{N}\frac{1}{N}X_i \]
Sum of \(X_i\) (all weights \(1\)) \[\sum_{i=1}^N X_i = \sum_{i=1}^N (1) X_i\]
Twice the sum of the \(X_i\) \[ 2\sum_{i=1}^N X_i = \sum_{i=1}^N (2) X_i\]
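The weights above can be checked numerically in R. The sketch below is an illustration only: the helper `lincomb`, the seed, and the sample size of 8 are assumptions, not part of the test. The last line adds one way to see why the sample variance is different: any linear combination with fixed weights doubles when every score is doubled, whereas the variance is multiplied by 4.

```r
set.seed(1)            # arbitrary seed
X <- rnorm(8)          # arbitrary scores
N <- length(X)

# Hypothetical helper: a linear combination is a weighted sum of the scores.
lincomb <- function(w, x) sum(w * x)

# Each pair should show the same value twice.
c(mean(X),    lincomb(rep(1 / N, N), X))   # weights 1/N
c(sum(X),     lincomb(rep(1, N), X))       # weights 1
c(2 * sum(X), lincomb(rep(2, N), X))       # weights 2

# Doubling X: the variance matches 4 * var(X), not 2 * var(X),
# so no fixed set of weights can reproduce it.
c(var(2 * X), 2 * var(X), 4 * var(X))
```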
To process this expression, you must examine it very closely.
\[\sum_{j=1}^4 \sum_{i=1}^j X_{ij}\]
Note that we begin by setting \(j=1\), putting us inside the first column. Then we run the row subscript \(i\) from 1 to the current value of \(j\), which is 1. This means that, so far, we have selected \(X_{11}\).
Next, we set \(j = 2\), putting us in the second column, and we run the row subscript from 1 to 2. At this point we've added \(X_{11} + (X_{12} + X_{22})\)
Next we set \(j=3\) and run \(i\) from 1 to 3, etc.
This means that we have added all the upper triangular elements of \(X\), i.e., those elements for which \(j \ge i\). So the result is \[ (X_{11}) + (X_{12} + X_{22}) + (X_{13} + X_{23} + X_{33}) \\ +(X_{14} + X_{24} + X_{34} + X_{44})\]
This is equal to
\[ (3) + (3 + 9) + (8 +8 + 2) + (5 + 9 + 4 + 4) = 55 \]
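The same bookkeeping can be traced in R with a pair of nested loops. This is a minimal sketch: the upper-triangular entries are copied from the sum above, while the entries below the diagonal do not enter the sum and are not given here, so they are left as `NA` placeholders rather than guessed.

```r
# Entries with i <= j come from the sum above; entries below the diagonal
# are unknown and unused, so they stay NA (an assumption of this sketch).
X <- matrix(NA_real_, nrow = 4, ncol = 4)
X[1, 1]   <- 3
X[1:2, 2] <- c(3, 9)
X[1:3, 3] <- c(8, 8, 2)
X[1:4, 4] <- c(5, 9, 4, 4)

# Outer index j picks the column; inner index i runs from 1 to j.
total <- 0
for (j in 1:4) {
  for (i in 1:j) {
    total <- total + X[i, j]
  }
}
total

# The same sum, selecting the upper triangle (including the diagonal) directly.
sum(X[upper.tri(X, diag = TRUE)])
```

Both lines should reproduce the total of 55 computed by hand.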